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Abstract 

Monte-Carlo simulations are routinely used for estimating the scaling exponents 
of complex systems. However, due to finite-size effects, determining the exponent 
values is often difficult and not reliable. Here we present a novel technique of dealing 
the problem of finite-size scaling. The efficiency of the technique is demonstrated 
on two data sets. 

Key words: finite-size scaling, Monte-Carlo, critical phenomena, fractal dimension 
PACS: 05.40.-a, 64.60.an, 64.60.De, 68.35.Ct 



1 Introduction 



Determining the scaling exponents from the finite-size simulation data is a very 
common task in the physics of complex systems. In particular, this technique is 
widely used in the context of phase transitions, surface roughening, turbulence, 
granular media, etc, c.f. reviews [TU21I3] . Typically, the finite size Monte-Carlo 
studies involve extrapolation of the simulation data towards infinity Unless 
there is some theoretical understanding about the functional form of the finite 
size corrections to the asymptotic scaling laws of the particular system, such 
an extrapolation carries a risk of underestimating the uncertainties. In some 
cases, it may be helpful to increase the computation time and system size, 
and optimise the simulation scheme (c.f. [1]). However, this is not always 
feasible, because the convergence to the asymptotic scaling law may be very 
slow. Additional difficulties arise, when one needs to determine the exponents 
of the finite-size correction terms (c.f. [5]), or when the asymptotic power law 
includes a logarithmic pre-factor. 

In what follows, we develop a novel technique of determining scaling exponents 
from the finite size simulation data. Using two different simulation series as ex- 
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amples, we demonstrate its effectiveness. The scaling exponents are found with 
high accuracy, even in the case of an extremely slow asymptotic convergence 
(when a straightforward power law fit is effectively unusable). Furthermore, 
we are able to calculate the exponents of the first and second correction terms 
to the asymptotic scaling law. 



2 Description of the method 

Let us consider a model system, described by its size N, assuming that the 
smallest possible value of N plays the role of the unit length. Suppose that 
the mathematical expectation of a certain physical quantity scales as L oc N a 
for AT ^> 1. For illustration, if our model system is a percolation lattice inside 
a square of side length N at criticality, then the quantity L could be the mass 
of its largest percolation cluster. The Monte-Carlo simulations can be used to 
estimate the values of the mathematical expectation L for several system sizes 



let us denote these estimates as Li, i = 1, . . .n, and the variances of them 
as of. Then, root-mean-square fit can be used to obtain the scaling exponent 
a, c.f. [2]. However, it is often difficult to estimate the uncertainty of the 
obtained result, because the magnitude of the finite-size corrections to the 
asymptotic law L = LqN 010 is unknown. Of course, one can plot ln£j versus 
IniVj and determine such a transition point i = k that for % > k, the data 
points lay within their statistical uncertainties on a straight line. Then, only 
the data points with i > k will be used for finding the exponent ot\. However, 
one can easily underestimate the adequate value of k, because the statistical 
fluctuations just happen to compensate the finite-size corrections AL. On the 
other hand, taking excessively large values of k would inflate the variance a ai 
of the outcome. Finally, in some cases, the decay rate of the corrections AL 
can be very slow (see below), so that the method outlined above will fail at 
the first step — there is no linear range of the graph. 

This problem can be resolved, if it is possible to find more than one physical 
quantity with similar scaling behaviour. Suppose one can define m distinct 
quantities, the mathematical expectations L k (k = 1,2, . . . m) of which obey 
identical scaling exponents (examples will be provided in the next Section). 
The method will work, if the following additional conditions are satisfied. First, 
the mathematical expectations L k can be expanded asymptotically as 



N, < N 2 < ... < N n , 



oc 



L k{N) = A ^kN a ^ k , with a {fl+1)k < a^ k ; 




fj,=i 
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second, the first m leading exponents a M fc,/i = l,...m are equal, so that we 
can designate 



a 



M = a^ k , for fi < m. (2) 



Third, the first m leading terms in the expansion are linearly independent, 
detA^k (with /i,k < m). We shall find out later, how to verify, if these 
conditions are satisfied. 

Now, let us designate the sum of the residual terms as 

oo 

6i(N)= ]T A^N"", 

fj,=m+l 



and introduce B^ k as the mxm inverse matrix of A^ k . Then, the relationship 

m 

J^A^N^ =Li(N)-5 k (N) 



can be rewritten as 

m 

N a » = Y,L k (N)B kfl -A,(N), (3) 
fe=i 



where 



^(N) = J2Sk(N)B k ^ (4) 

k=l 



So, neglecting the residual term A At (iV), the power law N ap - can be expressed 
as a linear combination of the m functions L k (N). Hence, if we perform a 
least-square fit of the n-dimensional vector x d = {Nf, N$, . . . N%) with a linear 
combination of the m data vectors C k = (£ k i, £k2, ■ ■ ■ £>kn), and plot the sum 
of the squared residuals S(d) as a function of d, then there should be m 
minima, at d — ai, a 2 , ■ ■ ■ a m . Here, C ki denotes the Monte-Carlo estimate 
of the expectation L k (N) at N — Ni. Indeed, according to Eq. (j3J), such a 
fit should be possible for the listed values of d [assuming that the statistical 
fluctuations dominate over the residual terms A M (iV)]. So, if the function S(d) 
does, indeed, have m clear minima, then the validity of the above mentioned 
assumptions is confirmed, and the positions of the minima can be used to find 
the values of the exponents for // = 1, . . . m. 
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Let us consider this method in more details. So, we need to find the function 

n I m \ 2 

5(d) = E [Nt-Y.CkC k A s~\ (5) 
1=1 \ k=l ) 

where the constants C k are optimised, yielding the minimal value of the ex- 
pression in right-hand-side; here sf denotes the variance of the expression 
between the braces, 

n 

S; = CkCiT<kii, 
k,l=l 

and Efcjj is the covariance matrix of the Monte-Carlo simulation results for 
Lfc(iV) at iV = Ni. The covariance matrix is needed, because in order to save 
the computation time, it is reasonable to use the same simulation data for all 
the quantities L k . These quantities are probably strongly correlated, because 
they describe similar aspects of the model system. So, a proper statistical 
analysis requires the covariance matrix. Fortunately, it can be easily estimated 
using the same Monte-Carlo simulation data, without noticeable increase in 
computing resources. 

If the weighting factors s^ 2 in Eq. ([5]) were constant, then the problem of 
finding the function S(d) would be a simple linear least-square fitting task. 
Things are slightly more complicated due to the fact that the weighting fac- 
tors depend on the constants However, if the variances sf are calculated 
iteratively (using the constants C k from the previous iteration), then the con- 
vergence is very fast (typically, no more than three iterations are required). 

Notice that at the optima, i.e. for d = a M (// < m), the random variable S(d) 
should have chi-square distribution with n — m — 1 degrees of freedom, if all 
the three above mentioned assumptions are fully satisfied. So, the values at 
the minima, 5 , (o! Al ), can be compared with the critical values Xn-m-i(p) °f the 
chi-square distribution, to further examine the validity of these assumptions. 
If the result is positive [i.e. S(ak) < Xn-m-i(p)]> the same critical values can 
be used to estimate the uncertainties of the results by finding such values A ak 
that S{a k ± A a J w xl- m -i{v)- 

As a final remark, let us notice that one could use the power series ([3]) for a 
direct nonlinear least-square fitting. However, nonlinear fitting by itself is a 
difficult task, if there is a large number of fitting parameters. What is more 
important, a larger number of fitting variables would be needed, to take into 
account the same number of terms in the asymptotic expansion (i.e. to achieve 
the same accuracy). Indeed, here, we have m + 1 fitting parameters (Ck and d). 
A straightforward fit with m first terms in the expansion ([3]) would include 2m 
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fitting parameters (A^k and a M , /x = 1, . . . m). Out of those 2m parameters, 
half are nonlinear ones (ct M ). As a result, it would be practically impossible 
to handle more than two terms in the power series. Besides, more fitting 
parameters results in larger uncertainties of the fitting results. 



3 First example: statistical topography of rough surfaces 

Here we consider the simulations, which were performed to determine the 
fractal dimension of a certain set of contour lines of random Gaussian self- 
affine surfaces. This set of contour lines will be referred to as the "oceanic 
coastline". Providing detailed discussions, why it was necessary to define and 
study the "oceanic coastlines", is beyond the scope of the present paper. In 
what follows, only as much details will be provided, as is needed for illustrating 
our new technique of determining the scaling exponents. 

Let us consider a random surface, which is given by the surface height ip(x, y) = 
ip(r) over a two-dimensional plane. It is assumed that this surface is Gaussian 
and self-affine, characterized by the Hurst exponent H: 



Here, the angular braces denote averaging over different realizations of the 
surface; we assume that < H < 1. 

Further, let the surface be flooded by water up to a level h. Then, regions with 
ip{r) < h will be called "wet". Now, we pick a connected (possibly infinite) 
wet region and name it "ocean" (all the other connected wet regions are called 
"lakes"). The perimeter of the "ocean" is called the "oceanic coastline" 

The fractal dimension of the "oceanic coastline" has been calculated numer- 
ically, using the following method. Instead of isotropic two-dimensional sur- 
faces, 1+1-dimensional [(1+1)D] random surfaces are generated using the lat- 
tice of the four- vertex model [BJ. Such surfaces are assumed to belong to the 
same universality class as the statistically isotropic 2D surfaces, but are nu- 
merically more efficient. So, the surface height is given by 



where fjj and gn are two uncorrelated one- dimensional discretised fractional 
Brownian functions, which take only integer values and satisfy additional con- 
straint 




(6) 



*l>(x,y) = MbJ) -9h([v\), 



f H (j) = f H (j-i)±i, 



(7) 
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Here, [^J denotes the floor function of x. These one- dimensional functions are 
generated using uncorrelated random sequences of "spins" Sj = ±1 (i E N). 
Each "spin" sequence defines an aim function 

Fn{j) = T^\3-i\ H -°-\ (8) 

i=0 



where i,j G N. The aim function is approximated by a function as 
closely as possible under the constraint (JTj). 

The simulations have been performed for square polygons of side length N\ = 
128, N 2 = 196, iV 3 = 256, iV 4 = 392, . . . N 9 = 2048, using different values of 
H. For each realization of the surface, such an "oceanic coastline" has been 
found, which connects a pair of opposite edges of the polygon. This has been 
achieved by simulating the following surface flooding process. The polygon 
boundary is assumed to be impenetrable for the water. The water is injected 
slowly onto the surface at the lowest point of the perimeter of the polygon. 
Water injection is terminated as soon as the flooded region connects a pair of 
opposite edges of the polygon. At the end of such a flooding, three quantities 
were recorded: Li — the coastline length, L 2 — the number of cells (i.e. the 
faces of the square lattice) touching the coastline, and L 3 — the coastline 
length immediately before achieving the critical flood level. Apparently, all 
these quantities have the same asymptotic scaling exponent a%, i.e. oc N ai 
with k = 1, 2 and 3. For each polygon size and Hurst exponent value, M = 10 s 
or more surfaces have been generated. 

In Fig, 1, the strength of finite size effects is demonstrated by plotting the dif- 
ferential fractal dimension, defined as d c = log 2 (£fci/£fci+i), versus the system 
size i/iVjiVj+i. For H = 0.5, these curves seem to suggest that the asymp- 
totic value of the oceanic coastline fractal dimension is d c = 1.419 ± 0.002. 
For H = 0, the finite size effects are so strong that it is impossible to give 
any estimate for d c . In Fig. 2, the curves log 10 [S (a) / (n — 4)] are plotted ver- 
sus the exponent value d. The existence of three sharp minima for all the 
curves confirms the validity of the assumptions required for the applicability 
of our new method. In particular, it is possible to conclude that for H = 0.5, 
d c w 1.4203 ± 0.0009 and for H = 0, d c » 1.8975 ± 0.0025. Also, it is 
possible to determine the values of the second and third exponents of the 
asymptotic expansion. For H = 0.5, these values are a 2 = 0.745 ± 0.015 and 
a 3 = 0.353 ± 0.004; for H = 0, a 2 = 1.563 ± 0.005 and a 3 = 0.721 ± 0.005. 
The uncertainties here have been obtained using the critical coefficients for 
the chi-square distribution for p = 95%. 

Finally, let us study the shape of the curve for H = in Fig. 2 in more details. 
This curve corresponds to n — 4 = 4 degrees of freedom. The last minimum (at 
d = «i) is so deep that the residual term in Eq. is clearly negligible. 
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Fig. 1. The values of the differential fractal dimension of the "oceanic coastline" 
d c = log 2 (£fci/ ' Cki+i) are plotted versus the system size \J NiNi + \ in semilogarith- 
mic graph. Upper three curves correspond to the Hurst exponent H = 0.5, lower 
curves — to H = 0. While upper curves seem to converge around d c « 1.419, no 
convergence can be observed for H = 0. Using the same simulation data, our new 
method yields d c ss 1.4203 ±0.0009 for H = 0.5 and d c ^ 1.8975 ±0.0025 for H = 0. 
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Fig. 2. The logarithm of the sum of squared residuals (reduced to the number of 
degrees of freedom), log w [S(d)/(n — 4)], is plotted versus d for different values of 
the Hurst exponent H (the curves are labeled with numbers indicating the value 
of H). For all the curves, three deep minima are present. The positions of these 
minima allow us to determine the exponents of the asymptotic expansion [Eq. Q]. 

However, the minima at d = «2 and do not pass the test based on the 
chi-square distribution. The difference in depth of the minima is explained by 
two circumstances. First, the residual terms A M can be of different amplitude 
for different values of /z, because the terns in Eq. (j4j) can efficiently cancel 
out (likewise, they can also magnify each other). Second, the variance of the 
linear combination in Eq. ([3l) can also depend considerably on /i, due to the 
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non-diagonal elements of the covariance matrix E^. Therefore, in order to 
determine the values of a 2 and 013, it was necessary to skip respectively one 
and two datapoints (at N = Ni and iV = N 2 ), reducing the number of degrees 
of freedom down to 3 or 2. 



4 Second example: hulls of 1+1-dimensional percolation problem 

Here we consider a simple modification of the two-dimensional percolation 
problem, when the bond breaking process is defined by two functions of one 
variable (hence, we call it 1+1 dimensional). More specifically, we use the 
surrounding lattice, the sites of which are at the middlepoints of the of the 
percolation lattice. Let the x and y axes be defined so that the sites of the 
surrounding lattice have integer coordinates. Further, if we have two functions 
f(x) and g(y), which take random uncorrelated values +1 or —1 for each 
integer x and y, then the bond of the percolation lattice with middlepoint at 
(x,y) is broken, if f(x)g(y) = —1; the bond is present, if f(x)g(y) = 1. 

We study the scaling of the length of the hull of a percolation cluster as 
a function of its diameter. To this end, we use Monte-Carlo simulations to 
estimate mathematical expectations of three quantities for such hull segments, 
which starts at the origin, and reach the edge of a square polygon of side length 
2N. These quantities are defined as follows: 

L\ — the length of the hull, measured in the number of surrounding lattice 
elements; 

L 2 — the number of such segments of the hull, which have three consecutive 
clockwise turns or three consecutive counter-clockwise turns (and hence, 
consist of four elements of the surrounding lattice); 

L 3 — the number of such segments of the hull, which have four alternating 
turns (left-right-left-right or right-left-right-left). 

The simulation results are again illustrated by two figures. In Fig, 3, the dif- 
ferential fractal dimension [defined as before, d c = log 2 (£ki/ £ki+i)], is plotted 
versus the polygon size y/N{N i+ i. 

Unlike in the case of "oceanic coastlines", here the polygon size can be dy- 
namically enlarged: we track a hull, and at the moment when it reaches the 
polygon boundaries at N — Ni, we record the data for Ni and enlarge the 
polygon size up to iV = iVj+i. We start with a new hull, if the hull forms a 
closed loop, or if Ni reaches the maximal allowable value. The benefit is that 
we can make the array of datapoints more dense, without spending additional 
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Fig. 3. The differential fractal dimension of the hull of the (1+1)D percolation 
clusters d c = \og 2 (Cki / Cki+i) is plotted versus the system size y/NiNi^[ in semilog- 
arithmic graph, similarly to Fig. 1. 

computing time. The drawback is that there will be correlations between the 
neighbouring datapoints, hence the statistical analysis will become more com- 
plicated. 

In order to take into account the correlations in a correct way, it would be 
necessary to perform a linear transform of the n-dimensional space of data- 
vectors, making the covariance matrix diagonal. Then, ordinary least-square fit 
could be performed in the new system of coordinates. However, as a simplified 
approach, one can calculate still the sum of squared residuals S(d), initially 
ignoring the correlations between the data points. By doing so, we keep the 
minima of the function S(d) in their correct places, i.e. the estimates for the 
scaling exponent values remain completely correct. However, our error analysis 
will be approximate, because the distribution of S(a fl ) will no longer be the 
chi-square one with n — m degrees of freedom. The effective number of degrees 
of freedom will be somewhat smaller: n needs to be substituted by an efficient 
number of uncorrelated datapoints n c g < n. The correlated data fluctuations 
contribute to the fluctuations of S(d) by enhancing each other. As a result, 
S(d) will have a chi-square distribution with n e s—m degrees of freedom, scaled 
by a factor of k = (n/n e s) 2 . The covariance matrix can be used to estimate 
the value of n eS . 

This approach has been used in Fig. 3, with n/n e fj = 3, where 
log 10 [fi:~ 1 5'((i)/(?T,efr — 4)] is plotted versus the exponent d, for different values 
of n. Different curves correspond to different number of skipped data points 
(the starting points of the respective data ranges are indicated in Fig. 3 by 
vertical lines A, B, and C). Insert provides a zoomed region around the sharp 
minimum for the curve labeled by B. These results allow us to conclude that 
«i = 1.56166 ± 0.00008, which is consistent with Fig. 3, but with increased 
precision. Also, we can conclude that a% = 0.66 ±0.03, and 0:3 = —0.07 ±0.14. 
It is most likely that in fact, a 3 = 0. Indeed, a 3 = corresponds to a constant 
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Fig. 4. Using the same simulation data as in Fig. 3, the logarithm of the sum of 
squared residuals is plotted versus d for different values of datapoints (the respective 
datapoint ranges are also marked in Fig 3.). For all the curves, three deep minima 
are present. 

offset of quantities which would appear immediately, if (for instance) L\ 
is measured without the first element (of the surrounding lattice). 



5 Conclusions 



A novel and universal method of determining the scaling exponents via finite- 
size Monte-Carlo simulations has been devised. The method can be applied, 
if it is possible to find m > 2 distinct quantities with equal scaling exponents. 
The two above considered examples provide general guidelines, how to define 
such quantities, and suggest that typically, these quantities can be indeed 
found. Here, we have used m = 3, which provides a good cancellation of 
higher order corrections to the asymptotic scaling law, and keeps the number 
of least-square fitting parameters reasonably small. 

For all the considered cases, the method increases the accuracy of the scaling 
exponent estimates. However, the method is particularly useful in the case of 
large corrections to the asymptotic scaling law [such as demonstrated in Fig. 1 
(H = 0)]. Also, the method is extremely useful, if it is necessary to find the 
exponents of the finite size correction terms. 

The support of Estonian Science F oundation grant No 6121 is acknowledged. 
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